N Semantic Classes are Harder than Two

نویسندگان

Ben Carterette

Rosie Jones

Wiley Greiner

Cory Barr

چکیده

We show that we can automatically classify semantically related phrases into 10 classes. Classification robustness is improved by training with multiple sources of evidence, including within-document cooccurrence, HTML markup, syntactic relationships in sentences, substitutability in query logs, and string similarity. Our work provides a benchmark for automatic n-way classification into WordNet’s semantic classes, both on a TREC news corpus and on a corpus of substitutable search query phrases.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

The two be's of English

This qualitative study investigates the uses of be in Contemporary English. Based on this study, one easy claim and one more difficult claim are proposed. The easy claim is that the traditional distinction between be as a lexical verb and be as an auxiliary is faulty. In particular, 'copular-be', traditionally considered to be a lexical verb, is in fact a prototypi...

متن کامل

On the Role of Derivational Processes in the Formation of Non-Taxonomic Classes of Lexical Units in Russian

The paper is focused on classes of lexical units which arise as a result of derivational processes – word formation and semantic transfers, acting either in isolation or together, on the basis of common semantic foundations that bind targets and sources of derivation. The lexical items which constitute the classes under study vary in their denotative characteristics and due to their categ...

متن کامل

Controlled Bootstrapping of Lexico-semantic Classes as a Bridge between Paradigmatic and Syntagmatic Knowledge: Methodology and Evaluation

Semantic classification of words is a highly context sensitive and somewhat moving target, hard to deal with and even harder to evaluate on an objective basis. In this paper we suggest a step–wise methodology for automatic acquisition of lexico–semantic classes and delve into the non trivial issue of how results should be evaluated against a top–down reference standard.

متن کامل

Estimation of Variance of Normal Distribution using Ranked Set Sampling

Introduction In some biological, environmental or ecological studies, there are situations in which obtaining exact measurements of sample units are much harder than ranking them in a set of small size without referring to their precise values. In these situations, ranked set sampling (RSS), proposed by McIntyre (1952), can be regarded as an alternative to the usual simple random sampling ...

متن کامل

Why Is Simulation Harder than Bisimulation?

Why is deciding simulation preorder (and simulation equivalence) computationally harder than deciding bisimulation equivalence on almost all known classes of processes? We try to answer this question by describing two general methods that can be used to construct direct one-to-one polynomial-time reductions from bisimulation equivalence to simulation preorder (and simulation equivalence). These...

متن کامل

ذخیره در منابع من

ذخیره در منابع من قبلا به منابع من ذحیره شده

{@ msg_add @}

با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره شماره

صفحات -

تاریخ انتشار 2006

N Semantic Classes are Harder than Two

نویسندگان

چکیده

منابع مشابه

The two be's of English

On the Role of Derivational Processes in the Formation of Non-Taxonomic Classes of Lexical Units in Russian

Controlled Bootstrapping of Lexico-semantic Classes as a Bridge between Paradigmatic and Syntagmatic Knowledge: Methodology and Evaluation

Estimation of Variance of Normal Distribution using Ranked Set Sampling

Why Is Simulation Harder than Bisimulation?

عنوان ژورنال:

اشتراک گذاری